Enhance the failover logic for balance procedure. #2232

Merged
merged 3 commits into from
Aug 5, 2020

Conversation

dangleptr
Contributor

@dangleptr dangleptr commented Jul 15, 2020

  1. For leaders and followers, check the state of each part before balancing. This gives the part a chance to recover from a previously failed state.
  2. For learners, check the peers and their state when the part is opened.

During chaos testing, we found problems when opening a part is too slow. The members of the raft group mix up their peers, and leader election keeps failing.

This PR makes it possible to recover from that situation.
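
The two checks in the description can be sketched roughly as follows. This is only an illustration of the idea, not the code in this PR: every name here (`PartState`, `Action`, `samePeers`, `checkBeforeBalance`) is hypothetical, and the real logic in the storage and raft layers is likely more involved.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical names throughout; none of these are taken from the nebula codebase.
enum class Role { Leader, Follower, Learner };
enum class Action { Proceed, Recover, ResetPeers };

struct PartState {
    Role role;
    bool healthy;                    // false if the last balance attempt left the part in a failed state
    std::vector<std::string> peers;  // peers the part currently believes it has
};

// Order-insensitive comparison of two peer lists (arguments are taken by value so they can be sorted).
bool samePeers(std::vector<std::string> a, std::vector<std::string> b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    return a == b;
}

// Decide what to do with a part before a balance step touches it.
Action checkBeforeBalance(const PartState& part,
                          const std::vector<std::string>& expectedPeers) {
    if (part.role == Role::Learner) {
        // Learners: verify the peers first, then the state, when the part is opened.
        if (!samePeers(part.peers, expectedPeers)) {
            return Action::ResetPeers;
        }
        return part.healthy ? Action::Proceed : Action::Recover;
    }
    // Leaders and followers: check the state, giving a failed part a chance to recover.
    return part.healthy ? Action::Proceed : Action::Recover;
}
```

In this sketch, a learner whose peer list disagrees with the expected one is told to reset its peers before anything else happens, which is one way to avoid the mixed-up raft membership described above.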

@dangleptr dangleptr added the ready-for-testing PR: ready for the CI test label Jul 15, 2020
@jude-zhu jude-zhu requested review from critical27 and liuyu85cn July 16, 2020 03:18
critical27
critical27 previously approved these changes Jul 20, 2020
Contributor

@critical27 critical27 left a comment

Well done!

@dangleptr
Contributor Author

Please hold on for a while.

@dangleptr dangleptr removed the ready-for-testing PR: ready for the CI test label Jul 24, 2020
@dangleptr dangleptr added the ready-for-testing PR: ready for the CI test label Jul 24, 2020
@dangleptr dangleptr changed the title from "Check peers when added part already existed" to "Enhance the failover logic for balance procedure." Jul 24, 2020
critical27
critical27 previously approved these changes Jul 27, 2020
Contributor

@critical27 critical27 left a comment

Thanks to THANOS.

Contributor

@liuyu85cn liuyu85cn left a comment

Great THANOS

@dangleptr dangleptr merged commit 28ddf2f into vesoft-inc:master Aug 5, 2020
xuguruogu pushed a commit to xuguruogu/nebula that referenced this pull request Sep 12, 2020
* Check peers when added part already existed

* Fix bug about catchup

Co-authored-by: heng <[email protected]>
tong-hao pushed a commit to tong-hao/nebula that referenced this pull request Jun 1, 2021
* Check peers when added part already existed

* Fix bug about catchup

Co-authored-by: heng <[email protected]>
Labels
ready-for-testing PR: ready for the CI test
3 participants